
    Defense against Adversarial Attacks Using High-Level Representation Guided Denoiser

    Neural networks are vulnerable to adversarial examples, which poses a threat to their application in security-sensitive systems. We propose a high-level representation guided denoiser (HGD) as a defense for image classification. A standard denoiser suffers from the error amplification effect, in which small residual adversarial noise is progressively amplified and leads to wrong classifications. HGD overcomes this problem by using a loss function defined as the difference between the target model's outputs activated by the clean image and by the denoised image. Compared with ensemble adversarial training, the state-of-the-art defense method on large images, HGD has three advantages. First, with HGD as a defense, the target model is more robust to both white-box and black-box adversarial attacks. Second, HGD can be trained on a small subset of the images and generalizes well to other images and unseen classes. Third, HGD can be transferred to defend models other than the one guiding it. In the NIPS competition on defense against adversarial attacks, our HGD solution won first place and outperformed other models by a large margin.
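
    A minimal sketch of the representation-guided loss described above, assuming PyTorch; `denoiser` and `feature_extractor` (the target model truncated at a high-level layer) are placeholder modules, and the exact architecture, layer choice, and norm are not reproduced from the paper.

```python
# Sketch of an HGD-style training loss: supervise the denoiser with
# high-level activations of the target model rather than pixel values.
import torch
import torch.nn as nn

def hgd_loss(denoiser: nn.Module, feature_extractor: nn.Module,
             x_clean: torch.Tensor, x_adv: torch.Tensor) -> torch.Tensor:
    """Distance between the target model's activations on the clean image
    and on the denoised adversarial image."""
    x_denoised = denoiser(x_adv)
    with torch.no_grad():                        # the target model stays fixed
        feats_clean = feature_extractor(x_clean)
    feats_denoised = feature_extractor(x_denoised)
    # Matching high-level features keeps small residual noise from being
    # progressively amplified into a wrong classification.
    return (feats_clean - feats_denoised).abs().mean()
```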

    Effects of stator-rotor interaction on unsteady aerodynamic load of compressor rotor blades

    During compressor operation, the unsteady aerodynamic load induced by the interaction of the stator and rotor blade rows is the main vibration source behind blade high-cycle fatigue, and it directly influences the fatigue strength of compressor blades. Further research on this unsteady aerodynamic load is therefore of great significance for improving the service life and reliability of compressor blades. Based on an aero-engine compressor rotor system, a three-dimensional flow-field model of the upstream stator and downstream rotor is established. Using numerical simulation, the compressor flow characteristics are solved at different moments in time. The paper then analyzes the process of stator-rotor interaction and the distribution law of the rotor blade aerodynamic load. In addition, the effects on rotor blade aerodynamic load are examined at different pressure ratios, rotational speeds, and stator-rotor blade number ratios. The results show that an unsteady flow-field area with lower speed is induced by stator-rotor interaction at the rotor blade leading edge. When the overlap between stator and rotor channels is at its maximum, the mass flow and static pressure around the rotor blade exhibit jumping fluctuations. The unsteady aerodynamic load fluctuates periodically, and the dominant frequencies lie mainly at multiples of the stator-rotor interaction frequency, especially at the fundamental frequency (1×f0). Within one interaction period T, the aerodynamic loads on the pressure and suction surfaces vary with contrary trends, and the magnitude and pulsation amplitude on the pressure surface are far greater than those on the suction surface. The effects of pressure ratio on the pressure and suction surfaces are consistent, and the magnitude of the aerodynamic load increases with pressure ratio. Rotational speed and stator-rotor blade number ratio affect the magnitude of the aerodynamic load on the suction surface more heavily than on the pressure surface. As rotational speed increases, the unsteady characteristics of the aerodynamic load are enhanced. Moreover, the pulsation amplitude and peak value of the unsteady aerodynamic load reach their maximum when the stator-rotor blade number ratio λ=1. This research provides a theoretical basis for the dynamic design of aero-engine compressor rotor systems.
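
    As a hedged aside (a standard turbomachinery relation, not stated in the abstract itself): the interaction frequency f0 referenced above is the blade-passing frequency, which for a stator row with Z_s vanes and a rotor speed of n rpm is

```latex
% Blade-passing (stator-rotor interaction) frequency; Z_s and n are assumed
% notation for the stator vane count and rotor speed in rpm.
f_0 = \frac{Z_s \, n}{60}\ \text{Hz},
\qquad \text{with dominant load harmonics at } k f_0,\ k = 1, 2, \dots
```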

    Fast multiplication of random dense matrices with fixed sparse matrices

    This work focuses on accelerating the multiplication of a dense random matrix with a (fixed) sparse matrix, an operation frequently used in sketching algorithms. We develop a novel scheme that takes advantage of blocking and recomputation (on-the-fly random number generation) to accelerate this operation. The techniques we propose decrease memory movement, thereby increasing the algorithm's parallel scalability on shared-memory architectures. On the Intel Frontera architecture, our algorithm can achieve 2x speedups over libraries such as Eigen and Intel MKL on some examples. In addition, with 32 threads, we can obtain a parallel efficiency of up to approximately 45%. We also present a theoretical analysis of the memory-movement lower bound of our algorithm, showing that under mild assumptions it is possible to beat the data-movement lower bound of general matrix-matrix multiplication (GEMM) by a factor of $\sqrt{M}$, where $M$ is the cache size. Finally, we incorporate our sketching algorithm into a randomized least-squares solver. For extremely over-determined sparse input matrices, we show that our results are competitive with SuiteSparse; in some cases, we obtain a speedup of 10x over SuiteSparse.
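
    A hedged sketch of the blocking-plus-recomputation idea, in NumPy/SciPy; the block size, the counter-based seeding, and the function name are illustrative, not the paper's implementation.

```python
# Multiply a fixed sparse matrix S by a random dense matrix A that is never
# fully materialized: each column block of A is regenerated from a seed on
# the fly, trading recomputation for reduced memory movement.
import numpy as np
import scipy.sparse as sp

def sparse_times_random(S: sp.csr_matrix, n_cols: int, block: int,
                        seed: int = 0) -> np.ndarray:
    """Compute S @ A, where A is i.i.d. Gaussian of shape (S.shape[1], n_cols)."""
    m, k = S.shape
    out = np.empty((m, n_cols))
    for j0 in range(0, n_cols, block):
        j1 = min(j0 + block, n_cols)
        # Regenerate this block of A deterministically instead of storing A;
        # seeding by (seed, block index) keeps blocks reproducible and
        # independent of the order in which they are processed.
        rng = np.random.default_rng([seed, j0])
        A_block = rng.standard_normal((k, j1 - j0))
        out[:, j0:j1] = S @ A_block
    return out
```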

    A distributed-memory parallel algorithm for discretized integral equations using Julia

    Boundary value problems involving elliptic PDEs such as the Laplace and the Helmholtz equations are ubiquitous in physics and engineering. Many such problems have alternative formulations as integral equations that are mathematically more tractable than their PDE counterparts. However, the integral equation formulation poses a challenge in solving the dense linear systems that arise upon discretization. In cases where iterative methods converge rapidly, existing methods that draw on fast summation schemes such as the Fast Multipole Method are highly efficient and well established. More recently, linear-complexity direct solvers that sidestep convergence issues by directly computing an invertible factorization have been developed. However, storage and compute costs are high, which limits their ability to solve large-scale problems in practice. In this work, we introduce a distributed-memory parallel algorithm based on an existing direct solver named "strong recursive skeletonization factorization." The analysis of its parallel scalability applies generally to a class of existing methods that exploit the so-called strong admissibility. Specifically, we apply low-rank compression to certain off-diagonal matrix blocks in a way that minimizes data movement. Given a compression tolerance, our method constructs an approximate factorization of a discretized integral operator (dense matrix), which can be used to solve linear systems efficiently in parallel. Compared to iterative algorithms, our method is particularly suitable for problems involving ill-conditioned matrices or multiple right-hand sides. Large-scale numerical experiments are presented to demonstrate the performance of our implementation using the Julia language.
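
    A hedged toy illustration of the compression step mentioned above: a truncated SVD stands in for the skeletonization used by the actual solver, and the kernel, geometry, and names are illustrative only.

```python
# Compress a well-separated ("strongly admissible") off-diagonal block of a
# dense kernel matrix to low rank, given a compression tolerance.
import numpy as np

def compress_block(B: np.ndarray, tol: float):
    """Return (U, V) with B ~= U @ V and relative truncation threshold tol."""
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    rank = int(np.sum(s > tol * s[0]))   # keep singular values above tol * s_max
    return U[:, :rank] * s[:rank], Vt[:rank]

# Two well-separated 1-D point clusters interacting via a Laplace-like kernel:
x = np.linspace(0.0, 1.0, 200)            # sources
y = np.linspace(9.0, 10.0, 200)           # targets, far from the sources
B = 1.0 / np.abs(x[:, None] - y[None, :])
U, V = compress_block(B, 1e-8)
print(U.shape[1])                                     # numerical rank << 200
print(np.linalg.norm(B - U @ V) / np.linalg.norm(B))  # small relative error
```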

    OTOv2: Automatic, Generic, User-Friendly

    Existing model compression methods based on structured pruning typically require complicated multi-stage procedures, and each stage demands considerable engineering effort and domain knowledge from end users, which prevents their wider application to broader scenarios. We propose the second generation of Only-Train-Once (OTOv2), which first automatically trains and compresses a general DNN only once from scratch to produce a more compact model with competitive performance, without fine-tuning. OTOv2 is automatic, pluggable into various deep learning applications, and requires minimal engineering effort from users. Methodologically, OTOv2 proposes two major improvements: (i) Autonomy: it automatically exploits the dependency structure of general DNNs, partitions the trainable variables into Zero-Invariant Groups (ZIGs), and constructs the compressed model; and (ii) Dual Half-Space Projected Gradient (DHSPG): a novel optimizer that solves structured-sparsity problems more reliably. Numerically, we demonstrate the generality and autonomy of OTOv2 on a variety of model architectures such as VGG, ResNet, CARN, ConvNeXt, DenseNet, and StackedUnets, the majority of which cannot be handled by other methods without extensive handcrafting effort. On benchmark datasets including CIFAR10/100, DIV2K, Fashion-MNIST, SVHN, and ImageNet, its effectiveness is validated by performing competitively with, or even better than, the state of the art. The source code is available at https://github.com/tianyic/only_train_once. Comment: Published at ICLR 2023. Note that a few images of dependency graphs could not be included in the arXiv version due to the size limit.
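
    A hedged toy sketch of the Zero-Invariant Group idea in a Conv-BN-Conv chain, assuming PyTorch; the grouping below is hand-written for illustration, whereas OTOv2 derives such groups automatically from the network's dependency graph.

```python
# One ZIG ties together every parameter of one hidden channel: zeroing the
# whole group leaves the network output unchanged, so it can be pruned away.
import torch
import torch.nn as nn
import torch.nn.functional as F

conv1 = nn.Conv2d(3, 8, 3, padding=1)
bn    = nn.BatchNorm2d(8)
conv2 = nn.Conv2d(8, 4, 3, padding=1)
bn.eval()  # freeze batch statistics so the check below is deterministic

def zero_group(c: int):
    """Zero every parameter tied to hidden channel c (one ZIG)."""
    with torch.no_grad():
        conv1.weight[c].zero_(); conv1.bias[c].zero_()
        bn.weight[c].zero_();    bn.bias[c].zero_()
        conv2.weight[:, c].zero_()

x = torch.randn(1, 3, 16, 16)
zero_group(2)
feats = bn(conv1(x))          # channel 2 is now identically zero
y_full = conv2(feats)

# Structurally remove channel 2; the output is unchanged, so pruning is safe.
keep = [c for c in range(8) if c != 2]
y_pruned = F.conv2d(feats[:, keep], conv2.weight[:, keep], conv2.bias, padding=1)
print(torch.allclose(y_full, y_pruned, atol=1e-6))  # True
```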